TWC: Frontier: Privacy for Social Science Research

Abstract

Information technology, advances in statistical computing, and the deluge of data available through the Internet are transforming social science. With the ability to collect and analyze massive amounts of data on human behavior and interactions, social scientists can hope to uncover many more phenomena, with greater detail and confidence, than allowed by traditional means such as surveys and interviews. In addition to advancing the state of knowledge, the rich analysis of behavioral data can enable companies to better serve their customers, and governments their citizenry. However, a major challenge for computational social science is maintaining the privacy of human subjects. At present, an individual social science researcher is left to devise her own privacy shields, such as stripping the dataset of “personally identifiable information” (PII). However, such privacy shields are often ineffective and provide limited or no real-world privacy protection. Indeed, there have been a number of cases where the individuals in a supposedly anonymized dataset have been re-identified. At the same time, social scientists are increasingly analyzing complex forms of data, such as large social networks, spatial trajectories, and semistructured text, that are even less amenable to naive attempts at anonymization. Beyond harm that may be suffered by the subjects themselves, such privacy violations are a serious threat to the future of computational social science research. After a few serious and highly publicized incidents, it may become much harder for researchers to obtain good social science data. Subjects may be reluctant to participate in experiments, data holders may become subject to stifling regulation, and companies may refuse to share proprietary data out of fear of lawsuits or bad public relations. 
This project is a broad, multidisciplinary effort to help enable the collection, analysis, and sharing of social science data while providing privacy for individual subjects. Bringing together computer science, social science, statistics, and law, the investigators seek to refine and develop definitions and measures of privacy and data utility, and design an array of technological, legal, and policy tools for social scientists to use when dealing with sensitive data. These tools will be tested and deployed at the Harvard Institute for Quantitative Social Science’s Dataverse Network, an open-source digital repository that offers the largest catalogue of social science datasets in the world. Our aim is to provide social scientists with a technological and legal framework that embodies the modern computational understanding of privacy, and a reliable open infrastructure that aids in the management of confidential research data from collection through dissemination.

Related resources

Online Self-Disclosure and Offline Threat Detection

Human beings have evolved to detect and react to threats in their physical environment, and have developed perceptual systems to assess physical, sensorial stimuli for current, material risks. In cyberspace, those stimuli can be absent, subdued, or deliberately manipulated by antagonistic third parties. Security and privacy concerns that would normally be activated in the offline world, therefo...

Open Data, Grey Data, and Stewardship: Universities at the Privacy Frontier

Contents (excerpt): Framing the problem; The Data-Rich World of Research Universities ...

Introduction to special issue on information privacy and trust in social media

The Web 2.0 indicates an emerging ‘social’ approach to generating and distributing Web content, characterized by open communication, decentralization of authority, and freedom to share and reuse [1]. The formation of dynamic coalitions aiming to share services and data can benefit from semantic web applications to facilitate cooperation across entities and users using different technologies. In...

Analyzing Tools and Algorithms for Privacy Protection and Data Security in Social Networks

The purpose of this research is to study the factors that influence privacy concerns about data security and protection on social network sites, and the influence of those concerns on self-disclosure. 100 articles about privacy protection, data security, information disclosure, and information leakage on social networks were studied. The types of models and algorithms and their frequency across the articles have been distinguished a...

A Study on Information Privacy Issue on Social Networks

In recent years, social networks (SN) have come to be used for communication and networking, socializing, marketing, and everyday life. Billions of people around the world are connected through various SN platforms and applications, which generates massive amounts of data online. This includes personal data or Personally Identifiable Information (PII). While more and more data a...

Publication date: 2012